74 research outputs found

    Moving a print-based editorial project into electronic form


    New tricks from an old dog: An overview of TEI P5


    XAIRA : software for language analysis

    This paper describes a software architecture developed at Oxford University Computing Services (OUCS) over the last decade for the analysis of large or small text corpora, in any language, using rich or only minimal XML markup.

    The Evolution of the Text Encoding Initiative: From Research Project to Research Infrastructure

    It is twenty-five years since the Text Encoding Initiative was first launched as a research project following an international conference funded by the US National Endowment for the Humanities. This article describes some key stages in its subsequent evolution from research project into research infrastructure. The TEI's changing nature, we suggest, is partly a consequence of its close and highly responsive relation with an active user community, which may also explain both its longevity and its effectiveness as a part of the digital humanities research infrastructure.

    ¿Qué es la Iniciativa de Codificación de Textos? (What is the Text Encoding Initiative?)

    The Guidelines of the Text Encoding Initiative (TEI) have long been regarded as the de facto standard for the preparation of digital textual resources in the scholarly research community. They offer a range of possibilities that is overwhelming for the beginner, reflecting the enormous range of possible applications of text encoding: from traditional scholarly editions to linguistic corpora, historical lexicons, digital archives, and much more. Drawing on numerous examples of TEI-encoded texts from a variety of research fields, this simple and straightforward book aims to help beginners make their own choices among the full range of TEI possibilities. It explains the XML technology used by the TEI in language accessible to the lay reader, offers a guided tour of the many parts of the TEI universe, and explains how it can be customized to suit the needs of an individual project.

    Resolving the Durand Conundrum

    This paper proposes a minor but significant modification to the TEI ODD language and explores some of its implications. Can we improve on the present compromise whereby TEI content models are expressed in RELAX NG? A very small set of additional elements would permit the ODD language to cut its ties with any existing schema language, and thus permit it to support exactly and only the subset or intersection of their facilities which makes sense in the TEI context. It would make the ODD language an integrated and independent whole rather than an uneasy hybrid, and pave the way for future developments in the management of structured text beyond the XML paradigm.

    What is TEI Conformance, and Why Should You Care?

    The recommendations of the Text Encoding Initiative (TEI) seem to have become a defining feature of the methodological framework of the Digital Humanities, despite recurrent concerns that the system they define is at the same time both too rigorous for the manifold variability of humanistic text, and not precise enough to guarantee interoperability of resources defined using it. In this paper I question the utility of standardization in a scholarly context, proposing however that documentation of formal encoding practice is an essential part of scholarship. After discussing the range of information such documentation entails, I explore the notion of conformance proposed by the TEI Guidelines, suggesting that this must operate at both a technical syntactic level, and a less easily verifiable semantic level. One of the more noticeable features of the Guidelines is their desire to have (as the French say) both the butter and the money for the butter; I will suggest that this polymorphous multiplicity is an essential component of the system, and has been a key factor in determining the TEI’s continued relevance.

    06491 Abstracts Collection -- Digital Historical Corpora - Architecture, Annotation, and Retrieval

    From 03.12.06 to 08.12.06, the Dagstuhl Seminar 06491 “Digital Historical Corpora - Architecture, Annotation, and Retrieval” was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

    In search of comity: TEI for distant reading

    Any expansion of the TEI beyond its traditional user base involves a recognition that there are many differing answers to the traditional question “What is text, really?” We report on some work carried out in the context of the COST Action Distant Reading for European Literary History (CA16204), in particular on the TEI-conformant schemas developed for one of its principal deliverables: the European Literary Text Collection (ELTeC). The ELTeC will contain comparable corpora for each of at least a dozen European languages, each being a balanced sample of one hundred novels from the period 1840 to 1920, together with metadata concerning their production and reception. We hope that it will become a reliable basis for comparative work in data-driven textual analytics. The focus of the ELTeC encoding scheme is not to represent texts in all their original complexity, nor to duplicate the work of scholarly editors. Instead, we aim to facilitate a richer and better-informed distant reading than a transcription of lexical content alone would permit. At the same time, where the TEI encourages diversity, we enforce consistency by permitting representation of only a specific and quite small set of textual features, both structural and analytical. These constraints are expressed by a master TEI ODD, from which we derive three different schemas by ODD chaining, each associated with appropriate documentation.